Whistler: a trainable text-to-speech system

Authors

  • Xuedong Huang
  • Alex Acero
  • J. Adcock
  • Hsiao-Wuen Hon
  • J. Goldsmith
  • Jingsong Liu
  • Mike Plumpe
Abstract

We introduce Whistler, a trainable Text-to-Speech (TTS) system that automatically learns its model parameters from a corpus. Both the prosody parameters and the concatenative speech units are derived using probabilistic learning methods that have been applied successfully in speech recognition. Whistler can produce synthetic speech that sounds very natural and resembles the acoustic and prosodic characteristics of the original speaker. The underlying technologies used in Whistler can significantly facilitate the creation of generic TTS systems for a new language, a new voice, or a new speech style.

Similar resources

Recent improvements on Microsoft's trainable text-to-speech system-Whistler

The Whistler Text-to-Speech engine was designed so that its model parameters can be constructed automatically from training data. This paper focuses on recent improvements in prosody and acoustic modeling, which are all derived through the use of probabilistic learning methods. Whistler can produce synthetic speech that sounds very natural and resembles the acoustic and prosodic characteristics of...

Automatic generation of synthesis units for trainable text-to-speech systems

The Whistler Text-to-Speech engine was designed so that its model parameters can be constructed automatically from training data. This paper describes in detail the design issues involved in constructing the synthesis unit inventory automatically from speech databases. The automatic process includes (1) determining the scalable synthesis unit which can reflect spectral variations of different allophones;...

Reducing the footprint of the IBM trainable speech synthesis system

This paper presents a novel approach for concatenative speech synthesis. This approach enables reduction of the dataset size of a concatenative text-to-speech system, namely the IBM trainable speech synthesis system, by more than an order of magnitude. A spectral acoustic feature based speech representation is used for computing a cost function during segment selection as well as for speech gen...

Phrase splicing and variable substitution using the IBM trainable speech synthesis system

This paper describes a phrase splicing and variable substitution system which offers an intermediate form of automated speech production lying between the extremes of recorded utterance playback and full Text-to-Speech synthesis. The system incorporates a trainable speech synthesiser and an application-specific set of pre-recorded phrases. The text to be synthesised is converted to a phone se...

Publication date: 1996